BUG: rolling with MSVC 2017 build #21813

chris-b1 · 2018-07-08T16:02:42Z

closes .rolling().std() only returns NaN in Python3.7 #21786
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

For now just trying 3.7 on appveyor, not sure if all necessary conda packages are there

jreback · 2018-07-08T16:03:57Z

appveyor.yml

+      PYTHON_VERSION: "3.7"
+      PYTHON_ARCH: "64"
+      CONDA_PY: "37"
+      CONDA_NPY: "113"


can you make this 114?

this also very likely won't work, very little is built for 3.7 :<

TomAugspurger · 2018-07-08T16:40:06Z

The c3i_test channel may have some missing packages.

codecov · 2018-07-08T19:25:45Z

Codecov Report

Merging #21813 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #21813      +/-   ##
==========================================
+ Coverage   91.99%      92%   +<.01%     
==========================================
  Files         167      170       +3     
  Lines       50578    50553      -25     
==========================================
- Hits        46530    46510      -20     
+ Misses       4048     4043       -5

Flag	Coverage Δ
#multiple	`90.4% <ø> (ø)`	⬆️
#single	`42.21% <ø> (+0.04%)`	⬆️

Impacted Files	Coverage Δ
pandas/io/formats/console.py	`65.15% <0%> (-4.55%)`	⬇️
pandas/core/frame.py	`97.19% <0%> (ø)`	⬆️
pandas/core/generic.py	`96.47% <0%> (ø)`	⬆️
pandas/core/internals/blocks.py	`94.46% <0%> (ø)`
pandas/core/internals/concat.py	`98.37% <0%> (ø)`
pandas/core/internals/managers.py	`96.52% <0%> (ø)`
pandas/core/indexes/base.py	`96.37% <0%> (ø)`	⬆️
pandas/core/arrays/categorical.py	`95.95% <0%> (ø)`	⬆️
pandas/io/formats/excel.py	`97.39% <0%> (+0.01%)`	⬆️
pandas/core/reshape/merge.py	`94.16% <0%> (+0.05%)`	⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0828c25...5d40eae. Read the comment docs.

chris-b1 · 2018-07-09T14:41:31Z

Well, good news is we got a repro of this on our CI - it's not a python 3.7 issue, but instead a Visual Studio 2017 one.
https://ci.appveyor.com/project/pandas-dev/pandas/build/1.0.15961/job/r9axp415no6jubnr#L2059

Still not convinced this isn't a compiler bug, but having trouble to get a simple case to reproduce - one option would be to build our wheels VS 2015 - but annoying, b/c python forces the newest version, so can't have 2017 installed. I'll keep trying some things.

chris-b1 · 2018-07-11T00:02:55Z

This is kind of horrible, but it does fix the problem (just a lint issue I pushed for) and I'm out of ideas - any thoughts or suggestions @jreback, @TomAugspurger ?

jreback · 2018-07-11T00:07:03Z

pandas/_libs/window.pyx

@@ -654,16 +654,21 @@ cdef inline void add_var(double val, double *nobs, double *mean_x,
                         double *ssqdm_x) nogil:
    """ add a value from the var calc """
    cdef double delta
-


doing

if val != val: return # rest of code

might work (and be more clear here)

Great suggestion, sadly didn't work

TomAugspurger · 2018-07-12T12:52:48Z

@chris-b1 could you summarize the issue? You think MSVC is just optimizing away if val == val, which isn't true when val is NaN?

chris-b1 · 2018-07-12T16:27:12Z

That's right, optimizing maybe not the right word, but somehow the add_var function is being turned into a no-op, the mean_x, ssqdm_x, and nobs variables don't get updated. Adding any kind of unconditional action to the function fixes it - originally discovered by noticing it works with a print statement thrown in.

And yes, val==val is a not-a-NaN check. We use it all over the place in this module, so it's not as if that check is universally being treated wrong, but something about this particular context makes MSVC think it's unreachable or something.

TomAugspurger · 2018-07-12T16:30:25Z

Ok, thanks...

@jjhelmus do we know anyone at Microsoft who works on their compilers that might be able to provide insight on #21813 (comment) and #21813 (comment)?

jjhelmus · 2018-07-13T14:49:21Z

@zooba might be able to connect you to the right team at Microsoft

zooba · 2018-07-13T17:49:49Z

I can try, but I haven't been having as much luck recently.

Since this is Cython, can someone post the generated code for the comparison? There's a much better chance of getting something useful if they're looking at the code they actually compile, and not the code someone else is compiling.

chris-b1 · 2018-07-13T18:38:08Z

It's Cython, so of course the bigger file is a tangled mess, but the particular function is easy to read - lines of gist here.
https://gist.github.com/chris-b1/cd974fb6b4165285a48b6d6c12129b40#file-window-cpp-L11028-L11096

Called in a few places, but for instance, here is a problem one.
https://gist.github.com/chris-b1/cd974fb6b4165285a48b6d6c12129b40#file-window-cpp-L11948

Cython isn't doing anything particular interesting - it's a cdef function so basically translates line for line into c++.

I can post it later, not on this machine, but I've also tried to write a simpler, plain c++ example that triggers the problem, but it always worked - really not sure why.

zooba · 2018-07-13T22:18:45Z

I can reproduce it with this snippet (has a few tricks to try and defeat optimisations, but they are likely unnecessary):

#include <math.h>
#include <stdio.h>

double get_value(int i) {
    return i > 1 ? NAN : 7;
}

int check_default(double v1, double v2) {
    return ((v1 == v2) != 0);
}

#pragma float_control(precise, off, push)
int check_fast(double v1, double v2) {
    return ((v1 == v2) != 0);
}
#pragma float_control(pop)


int main(int argc, char **argv) {
    double x1 = get_value(argc);
    double x2 = get_value(argc);
    fprintf(stdout, "default: %d\nfast:    %d\n",
        check_default(x1, x2), check_fast(x1, x2)
    );
}

The comparison doesn't behave properly when you disable precise floating point operations. You may be doing this in the build with /fp:fast (if you build this snippet with that option, both checks return '1'; if you build with the default /fp:precise, it works). In fact, the docs for this flag specifically call out NaN comparisons as something that only work in precise mode.

So I'd guess you need to update the compiler flags for this one, or possibly add an explicit isnan() call.

jbrockmendel · 2018-07-14T04:03:18Z

Is the MSVC info something that can be inferred from platform.uname (at compile-time)?

chris-b1 · 2018-07-15T18:44:08Z

Thanks for taking a look at this @zooba - unfortunately that's not exactly the same problem I'm seeing here, the compiled code is acting as if (v1 == v2) always evaluates to False/0. We are building with /fp:precise (via defaults, but I've also tried manually overriding)

chris-b1 · 2018-07-15T18:48:16Z

Interestingly using isnan from <math.h> does solve the problem here - still feels like MSVC is doing something wrong in the former case, but that's a reasonable enough workaround.

zooba · 2018-07-16T15:29:28Z

the compiled code is acting as if (v1 == v2) always evaluates to False/0

Do you have a disassembly showing this? This would be very strange to see as a compiler optimization (unless you mean (v == v) always evaluates to True/1?)

b/c python forces the newest version, so can't have 2017 installed

Just saw this while rereading the thread. If you start with all the tools on PATH and set DISTUTILS_USE_SDK=1 then it'll skip the autodetection and use whatever you provide. (And if you help encourage the ecosystem to get off distutils completely, then more of these problems go away :) )

jbrockmendel · 2018-07-16T15:33:59Z

Could this SO answer be relevant?

https://stackoverflow.com/questions/8200311/checking-for-nan-in-cython/20031818#20031818

chris-b1 · 2018-07-16T15:44:47Z

@zooba - do you know how to pull symbols into the dissambly from a .pyd? I was playing around with dumpbin / compile flags, but never got anything better than, e.g. below, where everything is still in offsets.

 dumpbin.exe /DISASM pandas\_libs\window.cp36-win_amd64.pyd /OUT:tmp.txt

chris-b1 · 2018-07-16T15:56:00Z

do you know how to pull symbols into the dissambly from a .pyd

Answering my own question - I was missing '/DEBUG' on extra_link_args

chris-b1 · 2018-07-16T16:42:35Z

Alright @zooba - I think I have this to a reasonable bit of a dis-assembly - pushing the bounds of my knowledge but looks like a code-generation bug.

Here's MSVC 2015 (printfs surround a call to add_var) which works correctly. In particular the highlighted lines are the NaN check.
https://gist.github.com/chris-b1/1049fdabafc56b2a25de7ad377016baa#file-msvc2015-asm-L6-L8

And here is MSVC 2017. The function call doesn't get inlined - half of the NaN check gets done (jn) - then the stack is being set up, then the other half the check (jne). I believe in setting up the stack, the ZF flag originally set by ucomisd gets set back to 0 (from the sub) so the jne check will always be true, and the function unconditionally exits early.
https://gist.github.com/chris-b1/199f818c1fbb8f8c4108aa9e430df4f7#file-msvc2017-asm-L15-L19

zooba · 2018-07-16T16:47:18Z

@chris-b1 Yep, you're absolutely right. Thanks for tracking that down, I'll pass it on!

zooba · 2018-07-16T16:54:47Z

Here is the public side of the report: https://developercommunity.visualstudio.com/content/problem/294290/incorrect-codegen-for-double-equality.html Feel free to add pings there whenever you like

chris-b1 · 2018-07-16T23:23:31Z

Thanks @zooba!

and thanks @jbrockmendel - that's what I'll have to do to fix the 2.7 build, more modern MSVCs are basically standards compliant and have isnan where it should be.

jreback · 2018-07-17T12:30:05Z

does using the fix proposed in #21940 help here? e.g. npy_isnan?

chris-b1 · 2018-07-17T14:28:22Z

does using the fix proposed in #21940 help here? e.g. npy_isnan?

It does (latest commit has that and is passing on appveyor) - but less sure we want to apply that universally, I'll try perf on @jbrockmendel PR later, suspect it may be worse on Windows.

jreback · 2018-07-25T10:16:27Z

@chris-b1 this looks good. can you rebase once more.

perf looks ok? (even if its slightly worse this is a bug :>)

Go ahead and merge on green.

chris-b1 · 2018-07-26T17:32:22Z

@jreback thanks, yeah perf will be fine, just slightly slower on Windows for this particular function

lumberbot-app · 2018-07-26T17:32:50Z

There seem to be a conflict, please backport manually

* master: BENCH: asv csv reading benchmarks no longer read StringIO objects off the end (pandas-dev#21807) BUG: df.agg, df.transform and df.apply use different methods when axis=1 than when axis=0 (pandas-dev#21224) BUG: bug in GroupBy.count where arg minlength passed to np.bincount must be None for np<1.13 (pandas-dev#21957) CLN: Vbench to asv conversion script (pandas-dev#22089) consistent docstring (pandas-dev#22066) TST: skip pytables test with not-updated pytables conda package (pandas-dev#22099) CLN: Remove Legacy MultiIndex Index Compatibility (pandas-dev#21740) DOC: Reword doc for filepath_or_buffer in read_csv (pandas-dev#22058) BUG: rolling with MSVC 2017 build (pandas-dev#21813)

jreback · 2018-07-30T11:43:54Z

cc @chris-b1 do you want to see if you can backprot to 23.x branch?

* Appveyor 3.7 * ci package list * change image type * try hack fix * lint * use isnan on problem function * use numpy compat isnan * use right isnan * work around OSX math undefs * cleanup const * fix reversion * ... (cherry picked from commit 7a2fbce)

* Appveyor 3.7 * ci package list * change image type * try hack fix * lint * use isnan on problem function * use numpy compat isnan * use right isnan * work around OSX math undefs * cleanup const * fix reversion * ...

jreback reviewed Jul 8, 2018

View reviewed changes

chris-b1 changed the title ~~BUG: rolling with 3.7 and windows~~ BUG: rolling with MSVC 2017 build Jul 11, 2018

jreback reviewed Jul 11, 2018

View reviewed changes

jreback added Reshaping Concat, Merge/Join, Stack/Unstack, Explode Numeric Operations Arithmetic, Comparison, and Logical operations Windows Windows OS labels Jul 11, 2018

chris-b1 added 10 commits July 22, 2018 15:50

Appveyor 3.7

e6dde80

ci package list

e83d43b

change image type

9592b6d

try hack fix

2305a32

lint

cfc5c29

use isnan on problem function

02779e3

use numpy compat isnan

78de263

use right isnan

ff2f389

work around OSX math undefs

87e69e8

cleanup const

f0628c0

chris-b1 force-pushed the roll-var-37 branch from 8decc5a to f0628c0 Compare July 22, 2018 20:53

chris-b1 added 2 commits July 22, 2018 18:50

fix reversion

ec6fbe2

...

5d40eae

jreback added this to the 0.23.4 milestone Jul 25, 2018

chris-b1 merged commit 7a2fbce into pandas-dev:master Jul 26, 2018

lumberbot-app bot added the Still Needs Manual Backport label Jul 26, 2018

TomAugspurger mentioned this pull request Jul 30, 2018

RLS: 0.23.4 #22128

Closed

WillAyd added the Window rolling, ewma, expanding label Sep 4, 2018

TomAugspurger mentioned this pull request Oct 16, 2018

Windows CI #23182

Merged

chris-b1 mentioned this pull request Oct 17, 2018

PERF: replace use of isnan once MSVC bug is fixed #23209

Closed

TomAugspurger removed Still Needs Manual Backport labels Mar 11, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: rolling with MSVC 2017 build #21813

BUG: rolling with MSVC 2017 build #21813

chris-b1 commented Jul 8, 2018 •

edited

Loading

jreback Jul 8, 2018

jreback Jul 8, 2018

TomAugspurger commented Jul 8, 2018

codecov bot commented Jul 8, 2018 •

edited

Loading

chris-b1 commented Jul 9, 2018

chris-b1 commented Jul 11, 2018

jreback Jul 11, 2018

chris-b1 Jul 12, 2018

TomAugspurger commented Jul 12, 2018

chris-b1 commented Jul 12, 2018

TomAugspurger commented Jul 12, 2018

jjhelmus commented Jul 13, 2018

zooba commented Jul 13, 2018

chris-b1 commented Jul 13, 2018

zooba commented Jul 13, 2018

jbrockmendel commented Jul 14, 2018

chris-b1 commented Jul 15, 2018

chris-b1 commented Jul 15, 2018

zooba commented Jul 16, 2018

jbrockmendel commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

zooba commented Jul 16, 2018

zooba commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

jreback commented Jul 17, 2018

chris-b1 commented Jul 17, 2018

jreback commented Jul 25, 2018

chris-b1 commented Jul 26, 2018

lumberbot-app bot commented Jul 26, 2018

jreback commented Jul 30, 2018

BUG: rolling with MSVC 2017 build #21813

BUG: rolling with MSVC 2017 build #21813

Conversation

chris-b1 commented Jul 8, 2018 • edited Loading

jreback Jul 8, 2018

Choose a reason for hiding this comment

jreback Jul 8, 2018

Choose a reason for hiding this comment

TomAugspurger commented Jul 8, 2018

codecov bot commented Jul 8, 2018 • edited Loading

Codecov Report

chris-b1 commented Jul 9, 2018

chris-b1 commented Jul 11, 2018

jreback Jul 11, 2018

Choose a reason for hiding this comment

chris-b1 Jul 12, 2018

Choose a reason for hiding this comment

TomAugspurger commented Jul 12, 2018

chris-b1 commented Jul 12, 2018

TomAugspurger commented Jul 12, 2018

jjhelmus commented Jul 13, 2018

zooba commented Jul 13, 2018

chris-b1 commented Jul 13, 2018

zooba commented Jul 13, 2018

jbrockmendel commented Jul 14, 2018

chris-b1 commented Jul 15, 2018

chris-b1 commented Jul 15, 2018

zooba commented Jul 16, 2018

jbrockmendel commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

zooba commented Jul 16, 2018

zooba commented Jul 16, 2018

chris-b1 commented Jul 16, 2018

jreback commented Jul 17, 2018

chris-b1 commented Jul 17, 2018

jreback commented Jul 25, 2018

chris-b1 commented Jul 26, 2018

lumberbot-app bot commented Jul 26, 2018

jreback commented Jul 30, 2018

chris-b1 commented Jul 8, 2018 •

edited

Loading

codecov bot commented Jul 8, 2018 •

edited

Loading